%0 Generic
%D 2018
%T Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior
%A Antigoni-Maria Founta
%A Constantinos Djouvas
%A Despoina Chatzakou
%A Ilias Leontiadis
%A Jeremy Blackburn
%A Gianluca Stringhini
%A Athena Vakali
%A Michael Sirivianos
%A Nicolas Kourtellis
%X <p>In recent years, offensive, abusive and hateful language, sexism, racism and other types of aggressive and cyberbullying behavior have been manifesting with increased frequency, and in many online social media platforms. In fact, past scientific work focused on studying these forms in popular media, such as Facebook and Twitter. Building on such work, we present an 8-month study of the various forms of abusive behavior on Twitter, in a holistic fashion. Departing from past work, we examine a wide variety of labeling schemes, which cover different forms of abusive behavior, at the same time. We propose an incremental and iterative methodology, that utilizes the power of crowdsourcing to annotate a large scale collection of tweets with a set of abuse-related labels. In fact, by applying our methodology including statistical analysis for label merging or elimination, we identify a reduced but robust set of labels. Finally, we offer a first overview and findings of our collected and annotated dataset of 100 thousand tweets, which we make publicly available for further scientific exploration.</p>
%S ICWSM-18
%I AAAI
%C Stanford, California

%0 Generic
%D 2018
%T Smart Cities at Risk!: Privacy and Security Borderlines from Social Networking in Cities
%A Moustaka, Vaia
%A Zenonas Theodosiou
%A Athena Vakali
%A Anastasis Kounoudes
%K online social networks
%K privacy threats
%K security threats
%K smart cities
%K smart living
%K smart people
%X <p class="rtejustify">As smart cities infrastructures mature, data becomes a valuable asset which can radically improve city services and tools. Registration, acquisition and utilization of data, which will be transformed into smart services, are becoming more necessary than ever. Online social networks with their enormous momentum are one of the main sources of urban data offering heterogeneous real-time data at a minimal cost. However, various types of attacks often appear on them, which risk users' privacy and affect their online trust. The purpose of this article is to investigate how risks on online social networks affect smart cities and study the differences between privacy and security threats with regard to smart people and smart living dimensions.</p>
%S WWW ’18 Companion
%I ACM
%C Lyon, France
%G eng
%( WWW ’18 Companion
%R https://doi.org/10.1145/3184558.3191516

%0 Conference Proceedings
%B Proceedings of the 26th International Conference on World Wide Web Companion
%D 2017
%T Detecting Aggressors and Bullies on Twitter
%A Despoina Chatzakou
%A Nicolas Kourtellis
%A Jeremy Blackburn
%A Emiliano De Cristofaro
%A Gianluca Stringhini
%A Athena Vakali
%K crowdsourcing
%K cyber-aggression
%K cyberbullying
%K Twitter
%X <p>Online social networks constitute an integral part of people's everyday social activity and the existence of aggressive and bullying phenomena in such spaces is inevitable. In this work, we analyze user behavior on Twitter in an effort to detect cyberbullies and cyber-aggressors by considering specific attributes of their online activity using machine learning classifiers.</p>
%S WWW '17 Companion
%I ACM
%C Perth, Australia
%P 767-768
%U http://dl.acm.org/citation.cfm?id=3054211
%R 10.1145/3041021.3054211

%0 Journal Article
%J Expert Systems with Applications
%D 2017
%T Detecting Variation of Emotions in Online Activities
%A Despoina Chatzakou
%A Athena Vakali
%A Konstantinos Kafetsios
%K Emotion detection
%K Hybrid process
%K Lexicon-based approach
%K Machine learning
%X <p>Online text sources form evolving large scale data repositories out of which valuable knowledge about human emotions can be derived. Beyond the primary emotions which refer to the global emotional signals, deeper understanding of a wider spectrum of emotions is important to detect online public views and attitudes. The present work is motivated by the need to test and provide a system that categorizes emotion in online activities. Such a system can be beneficial for online services, company recommendations, and social support communities. The main contributions of this work are to: (a) detect primary emotions, social ones, and those that characterize general affective states from online text sources, (b) compare and validate different emotional analysis processes to highlight the most efficient, and (c) provide a proof of concept case study to monitor and validate online activity, both explicitly and implicitly. The proposed approaches are tested on three datasets collected from different sources, i.e., news agencies, Twitter, and Facebook, and on different languages, i.e., English and Greek. Study results demonstrate that the methodologies at hand succeed in detecting a wider spectrum of emotions out of text sources.</p>
%B Expert Systems with Applications
%V 89
%P 318-332
%G eng
%U http://www.sciencedirect.com/science/article/pii/S0957417417305213
%R http://dx.doi.org/10.1016/j.eswa.2017.07.044

%0 Journal Article
%J Computers in Human Behavior
%D 2017
%T Experience of emotion in face to face and computer-mediated social interactions: An event sampling study
%A Konstantinos Kafetsios
%A Despoina Chatzakou
%A Nikolaos Tsigilis
%A Athena Vakali
%K Computer-mediated communication
%K Emotion
%K FtF
%K Internet
%K Social interaction
%X <p>The present study compared the experience of emotion in social interactions that take place face to face (FtF), co-presently, and those that take place online, in computer-mediated communications (CMC). For a period of ten days participants reported how intensely they experienced positive and negative emotions in CMC and in FtF interactions they had with persons from their social network. Results from factor analyses discerned a three-factor emotion structure (positive, negative, and anxious emotions) that was largely shared between CMC and FtF social interactions. Multilevel analyses of emotion across modes of interaction found that in FtF social encounters participants experienced more positive and less negative emotion and higher satisfaction than in CMC; there was no difference in anxious emotion. Positive, but not negative emotions or anxiety, partially mediated levels of satisfaction differences between interactions in CMC and those taking place FtF. The results point to similarities and differences in emotion experience in FtF and CMC, underlining in particular the affiliative function of positive emotion in people's encounters.</p>
%B Computers in Human Behavior
%V 76
%P 287-293
%G eng
%U http://www.sciencedirect.com/science/article/pii/S0747563217304557
%R https://doi.org/10.1016/j.chb.2017.07.033

%0 Conference Proceedings
%D 2017
%T Hate is not Binary: Studying Abusive Behavior of #GamerGate on Twitter
%A Despoina Chatzakou
%A Nicolas Kourtellis
%A Jeremy Blackburn
%A Emiliano De Cristofaro
%A Gianluca Stringhini
%A Athena Vakali
%X <p>Over the past few years, online bullying and aggression have become increasingly prominent, and manifested in many different forms on social media. However, there is little work analyzing the characteristics of abusive users and what distinguishes them from typical social media users. In this paper, we start addressing this gap by analyzing tweets containing a great amount of abusiveness. We focus on a Twitter dataset revolving around the Gamergate controversy, which led to many incidents of cyberbullying and cyberaggression on various gaming and social media platforms. We study the properties of the users tweeting about Gamergate, the content they post, and the differences in their behavior compared to typical Twitter users.</p> <p>We find that while their tweets are often seemingly about aggressive and hateful subjects, "Gamergaters" do not exhibit common expressions of online anger, and in fact primarily differ from typical users in that their tweets are less joyful. They are also more engaged than typical Twitter users, which is an indication as to how and why this controversy is still ongoing. Surprisingly, we find that Gamergaters are less likely to be suspended by Twitter, thus we analyze their properties to identify differences from typical users and what may have led to their suspension. We perform an unsupervised machine learning analysis to detect clusters of users who, though currently active, could be considered for suspension since they exhibit similar behaviors with suspended users. Finally, we confirm the usefulness of our analyzed features by emulating the Twitter suspension mechanism with a supervised learning method, achieving very good precision and recall.</p>
%S HT '17
%I ACM
%C Prague, Czech Republic
%G eng

%0 Conference Proceedings
%D 2017
%T Mean Birds: Detecting Aggression and Bullying on Twitter
%A Despoina Chatzakou
%A Nicolas Kourtellis
%A Jeremy Blackburn
%A Emiliano De Cristofaro
%A Gianluca Stringhini
%A Athena Vakali
%X <p>In recent years, bullying and aggression against users on social media have grown significantly, causing serious consequences to victims of all demographics. In particular, cyberbullying affects more than half of young social media users worldwide, and has also led to teenage suicides, prompted by prolonged and/or coordinated digital harassment. Nonetheless, tools and technologies for understanding and mitigating it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detect bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text, user, and network-based attributes, studying the properties of cyberbullies and aggressors, and what features distinguish them from regular users. We find that bully users post less, participate in fewer online communities, and are less popular than normal users, while aggressors are quite popular and tend to include more negativity in their posts. We evaluate our methodology using a corpus of 1.6M tweets posted over 3 months, and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, achieving over 90% AUC.</p>
%S WebSci '17
%I ACM
%C Troy, NY, USA
%G eng
%U https://arxiv.org/abs/1702.06877

%0 Conference Proceedings
%B Proceedings of the 26th International Conference on World Wide Web Companion
%D 2017
%T Measuring #GamerGate: A Tale of Hate, Sexism, and Bullying
%A Despoina Chatzakou
%A Nicolas Kourtellis
%A Jeremy Blackburn
%A Emiliano De Cristofaro
%A Gianluca Stringhini
%A Athena Vakali
%X <p>Over the past few years, online aggression and abusive behaviors have occurred in many different forms and on a variety of platforms. In extreme cases, these incidents have evolved into hate, discrimination, and bullying, and even materialized into real-world threats and attacks against individuals or groups. In this paper, we study the Gamergate controversy. Started in August 2014 in the online gaming world, it quickly spread across various social networking platforms, ultimately leading to many incidents of cyberbullying and cyberaggression. We focus on Twitter, presenting a measurement study of a dataset of 340k unique users and 1.6M tweets to study the properties of these users, the content they post, and how they differ from random Twitter users. We find that users involved in this "Twitter war" tend to have more friends and followers, are generally more engaged and post tweets with negative sentiment, less joy, and more hate than random users. We also perform preliminary measurements on how the Twitter suspension mechanism deals with such abusive behaviors. While we focus on Gamergate, our methodology to collect and analyze tweets related to aggressive and bullying activities is of independent interest.</p>
%S WWW '17 Companion
%I ACM
%C Perth, Australia
%P 1285-1290
%G eng
%U http://dl.acm.org/citation.cfm?id=3053890
%R 10.1145/3041021.3053890

%0 Book Section
%B Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece
%D 2016
%T A Distributed Framework for Early Trending Topics Detection on Big Social Networks Data Threads
%A Vakali, Athena
%A Kitmeridis, Nikolaos
%A Panourgia, Maria
%E Angelov, Plamen
%E Manolopoulos, Yannis
%E Iliadis, Lazaros
%E Roy, Asim
%E Vellasco, Marley
%X <p>Social networks have become big data production engines and their analytics can reveal insightful trending topics, such that hidden knowledge can be utilized in various applications and settings. This paper addresses the problem of popular topics’ and trends’ early prediction out of social networks data streams which demand distributed software architectures. Under an online time series classification model, which is implemented in a flexible and adaptive distributed framework, trending topics are detected. Emphasis is placed on the early detection process and on the performance of the proposed framework. The implemented framework builds on the lambda architecture design and the experimentation carried out highlights the usefulness of the proposed approach in early trends detection with high rates in performance and with a validation aligned with a popular microblogging service.</p>
%I Springer International Publishing
%C Cham
%P 186–194
%@ 978-3-319-47898-2
%G eng
%U http://dx.doi.org/10.1007/978-3-319-47898-2_20
%R 10.1007/978-3-319-47898-2_20

%0 Conference Paper
%D 2016
%T Early Malicious Activity Discovery in Microblogs by Social Bridges Detection
%A Antonia Gogoglou
%A Zenonas Theodosiou
%A Tasos Kounoudes
%A Athena Vakali
%A Yannis Manolopoulos
%X <p>With the emerging and intense use of Online Social Networks (OSNs) amongst young children and teenagers (youngsters), safe networking and socializing on the Web has faced extensive scrutiny. Content and interactions which are considered safe for adult OSN users might embed potentially threatening and malicious information when it comes to underage users. This work is motivated by the strong need to safeguard youngsters' OSN experience such that they can be empowered and aware. The topology of a graph is studied towards detecting the so called social bridges, i.e. the group(s) of malicious users and their supporters, who have links and ties to both honest and malicious user communities. A graph-topology based classification scheme is proposed to detect such bridge linkages which are suspicious for threatening youngsters' networking vulnerability. The proposed scheme is validated on a Twitter network, in which potentially dangerous users are identified based on their Twitter connections. The achieved performance is higher compared to previous efforts, despite the increased complexity due to the variety of groups identified as malicious.</p>
%I 16th International Symposium on Signal Processing and Information Technology
%C Limassol, Cyprus
%G eng

%0 Conference Paper
%B Workshop on Real-time & Stream Analytics in Big Data
%D 2016
%T A multi-layer software architecture framework for adaptive real-time analytics
%A Athena Vakali
%A Paschalis Korosoglou
%A Pavlos Daoglou
%K big data analytics
%K cloud based services
%K real time data management
%K software architectures
%X <p>Highly distributed applications dominate today’s software industry posing new challenges for novel software architectures capable of supporting real time processing and analytics. The proposed framework, so called REAλICS, is motivated by the fact that the demand for aggregating current and past big data streams requires new software methodologies, platforms and services. The proposed framework is designed to tackle data intensive problems in real time environments, via services built dynamically under a fully scalable and elastic Lambda based architecture. REAλICS proposes a multi-layer software platform, based on the lambda architecture paradigm, for aggregating and synchronizing real time and batch processing. The proposed software layers and adaptive components support quality of experience, along with community driven software development. Flexibility and elasticity are targeted by hiding the complexity of bootstrapping and maintaining a multi level architecture, upon which the end user can drive queries over input data streams. REAλICS proposes a flexible and extensible software architecture that can capture users' preferences at the front-end and adapt the appropriate distributed technologies and processes at the back-end. Such a model enables real time analytics in large-scale data driven cloud-based systems.</p>
%C Washington D.C.
%G eng

%0 Conference Proceedings
%B Advances in Intelligent Systems and Computing
%D 2015
%T New Trends in Database and Information Systems II - Selected papers of the 18th East European Conference on Advances in Databases and Information Systems and Associated Satellite Events, ADBIS 2014 Ohrid, Macedonia, September 7-10, 2014 Proceedings II
%E Nick Bassiliades
%E Mirjana Ivanovic
%E Margita Kon-Popovska
%E Yannis Manolopoulos
%E Themis Palpanas
%E Goce Trajcevski
%E Athena Vakali
%I Springer
%V 312
%@ 978-3-319-10517-8
%G eng
%U http://dx.doi.org/10.1007/978-3-319-10518-5
%R 10.1007/978-3-319-10518-5

%0 Journal Article
%J Multimedia Tools Appl.
%D 2014
%T Collaborative event annotation in tagged photo collections
%A Christos Zigkolis
%A Symeon Papadopoulos
%A Filippou, George
%A Yiannis Kompatsiaris
%A Athena Vakali
%K Event authoring
%K Ground truth generation
%K Multimedia annotation
%X <p>Events constitute a significant means of multimedia content organization and sharing. Despite the recent interest in detecting events and annotating media content in an event-centric way, there is currently insufficient support for managing events in large-scale content collections and limited understanding of the event annotation process. To this end, this paper presents CrEve, a collaborative event annotation framework which uses content found in social media sites with the prime objective to facilitate the annotation of large media corpora with event information. The proposed annotation framework could significantly benefit social media research due to the proliferation of event-related user-contributed content. We demonstrate that, compared to a standard "browse-and-annotate" interface, CrEve leads to a 19% increase in the coverage of the generated ground truth in a large-scale annotation experiment. Furthermore, the paper discusses the results of a user study that quantifies the performance of CrEve and the contribution of different event dimensions in the event annotation process. The study confirms the prevalence of spatio-temporal queries as the prime option of discovering event-related content in a large collection. In addition, textual queries and social cues (content contributor) were also found to be significant as event search dimensions. Finally, it demonstrates the potential of employing automatic photo clustering methods with the goal of facilitating event annotation.</p>
%B Multimedia Tools Appl.
%V 70
%P 89-118
%G eng

%0 Conference Paper
%B ICE-B
%D 2014
%T A Conceptual Enterprise Architecture Framework for Smart Cities - A Survey Based Approach
%A Kakarontzas, George
%A Anthopoulos, Leonidas G.
%A Despoina Chatzakou
%A Athena Vakali
%E Obaidat, Mohammad S.
%E Holzinger, Andreas
%E van Sinderen, Marten
%E Dolog, Peter
%I SciTePress
%P 47-54
%@ 978-989-758-043-7
%G eng

%0 Conference Proceedings
%B ADBIS (2)
%D 2014
%T New Trends in Databases and Information Systems, 17th East European Conference on Advances in Databases and Information Systems
%E Barbara Catania
%E Cerquitelli, Tania
%E Chiusano, Silvia
%E Guerrini, Giovanna
%E Kämpf, Mirko
%E Kemper, Alfons
%E Novikov, Boris
%E Palpanas, Themis
%E Pokorny, Jaroslav
%E Athena Vakali
%S Advances in Intelligent Systems and Computing
%I Springer
%C Genoa, Italy
%V 241
%8 04/2013
%@ 978-3-319-01863-8
%G eng

%0 Conference Paper
%B WIMS
%D 2014
%T Smart Cities Data Streams Integration: experimenting with Internet of Things and social data flows
%A Athena Vakali
%A Anthopoulos, Leonidas G.
%A Krco, Srdjan
%E Akerkar, Rajendra
%E Bassiliades, Nick
%E Davies, John
%E Ermolayev, Vadim
%I ACM
%P 60
%@ 978-1-4503-2538-7
%G eng

%0 Conference Proceedings
%B T. Large-Scale Data- and Knowledge-Centered Systems
%D 2014
%T Transactions on Large-Scale Data- and Knowledge-Centered Systems
%E Hameurlain, Abdelkader
%E Küng, Josef
%E Wagner, Roland
%E Barbara Catania
%E Guerrini, Giovanna
%E Palpanas, Themis
%E Pokorny, Jaroslav
%E Athena Vakali
%S Lecture Notes in Computer Science
%I Springer
%V 8920
%@ 978-3-662-45760-3
%G eng

%0 Conference Paper
%B ADBIS
%D 2013
%T Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Indexing
%A Kastrinakis, Dimitrios
%A Symeon Papadopoulos
%A Athena Vakali
%E Barbara Catania
%E Guerrini, Giovanna
%E Pokorny, Jaroslav
%K composite visual word
%K local descriptors
%K multimedia data indexing
%K visual word
%X <p>Multimedia data indexing for content-based retrieval has attracted significant attention in recent years due to the commoditization of multimedia capturing equipment and the widespread adoption of social networking platforms as means for sharing media content online. Due to the very large amounts of multimedia content, notably images, produced and shared online by people, a very important requirement for multimedia indexing approaches pertains to their efficiency both in terms of computation and memory usage. A common approach to support query-by-example image search is based on the extraction of visual words from images and their indexing by means of inverted indices, a method proposed and popularized in the field of text retrieval. The main challenge that visual word indexing systems currently face arises from the fact that it is necessary to build very large visual vocabularies (hundreds of thousands or even millions of words) to support sufficiently precise search. However, when the visual vocabulary is large, the image indexing process becomes computationally expensive due to the fact that the local image descriptors (e.g. SIFT) need to be quantized to the nearest visual words. To this end, this paper proposes a novel method that significantly decreases the time required for the above quantization process. Instead of using hundreds of thousands of visual words for quantization, the proposed method manages to preserve retrieval quality by using a much smaller number of words for indexing. This is achieved by the concept of composite words, i.e. assigning multiple words to a local descriptor in ascending order of distance. We evaluate the proposed method in the Oxford and Paris buildings datasets to demonstrate the validity of the proposed approach.</p>
%S Lecture Notes in Computer Science
%I Springer
%V 8133
%P 98-111
%@ 978-3-642-40682-9
%G eng

%0 Conference Paper
%B ICDM Workshops
%D 2013
%T Dissimilarity Features in Recommender Systems
%A Christos Zigkolis
%A Karagiannidis, Savvas
%A Athena Vakali
%E Wei Ding
%E Washio, Takashi
%E Xiong, Hui
%E Karypis, George
%E Thuraisingham, Bhavani M.
%E Cook, Diane J.
%E Wu, Xindong
%I IEEE Computer Society
%P 825-832
%@ 978-0-7695-5109-8
%G eng

%0 Journal Article
%J Expert Syst. Appl.
%D 2013
%T Integrating similarity and dissimilarity notions in recommenders
%A Christos Zigkolis
%A Karagiannidis, Savvas
%A Koumarelas, Ioannis K.
%A Athena Vakali
%K Dissimilarity recommender
%K Distributed framework
%K Recommender systems
%X <p>Collaborative recommenders rely on the assumption that similar users may exhibit similar tastes while content-based ones favour items that are found to be similar to the items a user likes. Weakly related entities, which are often considered to be useful, are neglected by those similarity-driven recommenders. To take advantage of this neglected information, we introduce a novel dissimilarity-based recommender that bases its estimations on degrees of dissimilarities among items' attributes. However, instead of using the proposed recommender as a stand-alone method, we combine it with similarity-based ones to maintain the selective nature of the latter while detecting, through our recommender, information that may have been overlooked. Such combinations are established by IANOS, a proposed framework through which we increase the accuracy of two popular similarity-based recommenders (Naive Bayes and Slope-One) after their combination with our algorithm. Improved accuracy results in experimentation on two datasets (Yahoo! Movies and Movielens) enhance our reasoning. However, the proposed recommender comes with an additional computational complexity when combined with other techniques. By using Hadoop technology, we developed a distributed version of IANOS through which execution time was reduced. Evaluation on IANOS procedures in terms of time performance endorses the use of distributed implementations.</p>
%B Expert Syst. Appl.
%V 40
%P 5132-5147
%G eng

%0 Conference Paper
%B Affective Computing and Intelligent Interaction (ACII), 2013 Humaine Association Conference on
%D 2013
%T Micro-blogging Content Analysis via Emotionally-Driven Clustering
%A Despoina Chatzakou
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Konstantinos Kafetsios
%K affective analysis methodology
%K Clustering algorithms
%K content management
%K content sharing
%K Dictionaries
%K emotion intensity monitoring
%K emotionally-driven clustering
%K Equations
%K human emotion states
%K information sharing
%K lexicon-based technique
%K Mathematical model
%K microblogging content analysis
%K pattern clustering
%K people perception
%K Pragmatics
%K Semantics
%K Sentiment analysis
%K social networking (online)
%K social pulse
%K social relations
%K text analysis
%K Twitter
%P 375-380
%8 Sept
%G eng
%R 10.1109/ACII.2013.68

%0 Conference Paper
%B Panhellenic Conference on Informatics
%D 2013
%T Requirements and architecture design principles for a smart city experiment with sensor and social networks integration
%A Samaras, Christos
%A Athena Vakali
%A Maria Giatsoglou
%A Despoina Chatzakou
%A Angelis, Lefteris
%E Ketikidis, Panayiotis H.
%E Margaritis, Konstantinos G.
%E Vlahavas, Ioannis P.
%E Chatzigeorgiou, Alexander
%E Eleftherakis, George
%E Stamelos, Ioannis
%I ACM
%P 327-334
%@ 978-1-4503-1969-0
%G eng

%0 Conference Paper
%B MMM (1)
%D 2013
%T Semi-supervised Concept Detection by Learning the Structure of Similarity Graphs
%A Symeon Papadopoulos
%A Sagonas, Christos
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Li, Shipeng
%E El-Saddik, Abdulmotaleb
%E Wang, Meng
%E Mei, Tao
%E Sebe, Nicu
%E Yan, Shuicheng
%E Hong, Richang
%E Gurrin, Cathal
%X <p>We present an approach for detecting concepts in images by a graph-based semi-supervised learning scheme. The proposed approach builds a similarity graph between both the labeled and unlabeled images of the collection and uses the Laplacian Eigenmaps of the graph as features for training concept detectors. Therefore, it offers multiple options for fusing different image features. In addition, we present an incremental learning scheme that, given a set of new unlabeled images, efficiently performs the computation of the Laplacian Eigenmaps. We evaluate the performance of our approach both on synthetic datasets and on MIR Flickr, comparing it with high-performance state-of-the-art learning schemes with competitive and in some cases superior results.</p>
%S Lecture Notes in Computer Science
%I Springer
%V 7732
%P 1-12
%@ 978-3-642-35725-1
%G eng

%0 Conference Paper
%B DATA
%D 2013
%T Social Data Sentiment Analysis in Smart Environments - Extending Dual Polarities for Crowd Pulse Capturing
%A Athena Vakali
%A Despoina Chatzakou
%A Vassiliki A. Koutsonikola
%A Andreadis, George
%E Helfert, Markus
%E Francalanci, Chiara
%E Filipe, Joaquim
%I SciTePress
%P 175-182
%@ 978-989-8565-67-9
%G eng

%0 Journal Article
%J J. Intell. Inf. Syst.
%D 2012
%T In & out zooming on time-aware user/tag clusters
%A Giannakidou, Eirini
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Yiannis Kompatsiaris
%K Events
%K Social tagging systems
%K Time-aware clustering
%K Users' interests over time
%X <p>The common ground behind most approaches that analyze social tagging systems is addressing the information challenge that emerges from the massive activity of millions of users who interact and share resources and/or metadata online. However, lack of any time-related data in the analysis process implicitly denies much of the dynamic nature of social tagging activity. In this paper we claim that holding a temporal dimension allows for tracking macroscopic and microscopic users' interests, detecting emerging trends and recognizing events. To this end, we propose a time-aware co-clustering approach for acquiring semantic and temporal patterns out of the tagging activity. The resulting clusters contain both users and tags of similar patterns over time, and reveal non-obvious or "hidden" relations among users and topics of their common interest. Zoom in &amp; out views serve as visualization methods on different aspects of the clusters' structure, in order to evaluate the efficiency of the approach.</p>
%B J. Intell. Inf. Syst.
%V 38
%P 685-708
%G eng

%0 Conference Paper
%B Future Internet Assembly
%D 2012
%T Towards a Narrative-Aware Design Framework for Smart Urban Environments
%A Srivastava, Lara
%A Athena Vakali
%E Alvarez, Federico
%E Cleary, Frances
%E Daras, Petros
%E Domingue, John
%E Galis, Alex
%E Garcia, Ana
%E Gavras, Anastasius
%E Karnouskos, Stamatis
%E Krco, Srdjan
%E Li, Man-Sze
%E Lotz, Volkmar
%E Müller, Henning
%E Salvadori, Elio
%E Sassen, Anne-Marie
%E Schaffers, Hans
%E Stiller, Burkhard
%E Tselentis, Georgios
%E Turkama, Petra
%E Zahariadis, Theodore B.
%B Future Internet Assembly
%S Lecture Notes in Computer Science
%I Springer
%V 7281
%P 166-177
%@ 978-3-642-30240-4
%G eng

%0 Conference Paper
%B Future Internet Assembly
%D 2012
%T Urban Planning and Smart Cities: Interrelations and Reciprocities
%A Anthopoulos, Leonidas G.
%A Athena Vakali
%E Alvarez, Federico
%E Cleary, Frances
%E Daras, Petros
%E Domingue, John
%E Galis, Alex
%E Garcia, Ana
%E Gavras, Anastasius
%E Karnouskos, Stamatis
%E Krco, Srdjan
%E Li, Man-Sze
%E Lotz, Volkmar
%E Müller, Henning
%E Salvadori, Elio
%E Sassen, Anne-Marie
%E Schaffers, Hans
%E Stiller, Burkhard
%E Tselentis, Georgios
%E Turkama, Petra
%E Zahariadis, Theodore B.
%B Future Internet Assembly
%S Lecture Notes in Computer Science
%I Springer
%V 7281
%P 178-189
%@ 978-3-642-30240-4
%G eng

%0 Conference Paper
%B MediaEval
%D 2011
%T CERTH @ MediaEval 2011 Social Event Detection Task
%A Symeon Papadopoulos
%A Christos Zigkolis
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Larson, Martha
%E Rae, Adam
%E Demarty, Claire-Helene
%E Kofler, Christoph
%E Metze, Florian
%E Troncy, Raphaël
%E Mezaris, Vasileios
%E Jones, Gareth J. F.
%X <p>This paper describes the participation of CERTH in the “Social Event Detection Task @ MediaEval 2011”, which aims at discovering social events in a large photo collection. The task comprises two challenges: (i) identification of soccer events in the cities of Barcelona and Rome, and (ii) identification of events taking place in two specific venues. We adopt an approach that combines spatial and temporal filters with tag-based location classification models and an efficient photo clustering method. In our best runs, we achieve F-measure and NMI scores of 77.4% and 0.63 respectively for Challenge 1, and 64% and 0.38 for Challenge 2.</p>
%B MediaEval
%S CEUR Workshop Proceedings
%I CEUR-WS.org
%V 807
%G eng

%0 Conference Paper
%B ICMR
%D 2011
%T City exploration by use of spatio-temporal analysis and clustering of user contributed photos
%A Symeon Papadopoulos
%A Christos Zigkolis
%A Kapiris, Stefanos
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Natale, Francesco G. B. De
%E Bimbo, Alberto Del
%E Hanjalic, Alan
%E Manjunath, B. S.
%E Satoh, Shin’ichi
%K Clustering
%K content browsing
%K landmark/event detection
%K spatio-temporal mining
%X <p>We present a technical demonstration of an online city exploration application that helps users identify interesting spots in a city by use of spatio-temporal analysis and clustering of user contributed photos. Our framework analyzes the spatial distribution of large city-centered collections of user contributed photos at different time scales in order to index the most popular spots of a city in a time-aware manner. Subsequently, the photo sets belonging to the same spatio-temporal context are clustered in order to extract representative photos for each spot. The resulting application enables users to obtain flexible summaries of the most important spots in a city given a temporal slice (time of the day, month, season). The demonstration will be based on a photo dataset covering major European cities.</p>
%B ICMR
%I ACM
%P 65
%@ 978-1-4503-0336-1
%G eng

%0 Journal Article
%J IEEE MultiMedia
%D 2011
%T Cluster-Based Landmark and Event Detection for Tagged Photo Collections
%A Symeon Papadopoulos
%A Christos Zigkolis
%A Yiannis Kompatsiaris
%A Athena Vakali
%X <p>The rising popularity of photo-sharing applications on the Web has led to the generation of huge amounts of personal image collections. Browsing through image collections of such magnitude is currently supported by the use of tags. However, tags suffer from several limitations (such as polysemy, lack of uniformity, and spam), thus not presenting an adequate solution to the problem of content organization. Therefore, automated content-organization methods are of particular importance to improve the content-consumption experience. Because it’s common for users to associate their photo-captured experiences with some landmarks (for example, a tourist site or an event, such as a music concert or a gathering with friends), we can view landmarks and events as natural units of organization for large image collections. It’s for this reason that automating the process of detecting such concepts in large image sets can enhance the experience of accessing massive amounts of pictorial content. In this article, we present a novel scheme for automatically detecting landmarks and events in tagged image collections. Our proposal is based on the simple yet elegant concept of image similarity graphs as a means of combining multiple notions of similarity between images in a photo collection; in our case, we use visual and tag similarity. We perform clustering on such image similarity graphs by means of community detection, a process that identifies on the graph groups of nodes that are more densely connected to each other than to the rest of the network. In contrast to conventional clustering schemes such as k-means or hierarchical agglomerative clustering, community detection is computationally more efficient and doesn’t require the number of clusters to be provided as input. Subsequently, we classify the resulting image clusters as landmarks or events by use of features related to the temporal, social, and tag characteristics of image clusters. In the case of landmarks, we also conduct a cluster-merging step on the basis of spatial proximity to enrich our landmark model.</p>
%B IEEE MultiMedia
%V 18
%P 52-63
%G eng

%0 Journal Article
%J TWEB
%D 2011
%T A Clustering-Driven LDAP Framework
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%K Clustering
%K DIT organization
%K LDAP services
%K merging criteria
%K query and retrieval engine
%X <p>LDAP directories have proliferated as the appropriate storage framework for various and heterogeneous data sources, operating under a wide range of applications and services. Due to the increased amount and heterogeneity of the LDAP data, there is a requirement for appropriate data organization schemes. The LPAIR &amp; LMERGE (LP-LM) algorithm, presented in this article, is a hierarchical agglomerative structure-based clustering algorithm which can be used for the LDAP directory information tree definition. A thorough study of the algorithm’s performance is provided, which designates its efficiency. Moreover, the Relative Link as an alternative merging criterion is proposed, since, as indicated by the experimentation, it can result in more balanced clusters. Finally, the LP and LM Query Engine is presented, which, considering the clustering-based LDAP data organization, results in the enhancement of the LDAP server’s performance.</p>
%B TWEB
%V 5
%P 12
%G eng

%0 Book Section
%B Social Media Modeling and Computing
%D 2011
%T Combining Multi-modal Features for Social Media Analysis
%A Nikolopoulos, Spiros
%A Giannakidou, Eirini
%A Yiannis Kompatsiaris
%A Patras, Ioannis
%A Athena Vakali
%E Hoi, Steven C. H.
%E Luo, Jiebo
%E Boll, Susanne
%E Xu, Dong
%E Jin, Rong
%B Social Media Modeling and Computing
%I Springer
%P 71-96
%@ 978-0-85729-435-7
%G eng

%0 Book Section
%B Community-Built Databases
%D 2011
%T Community Detection in Collaborative Tagging Systems
%A Symeon Papadopoulos
%A Athena Vakali
%A Yiannis Kompatsiaris
%E Pardede, Eric
%B Community-Built Databases
%I Springer
%P 107-131
%@ 978-3-642-19046-9
%G eng

%0 Conference Paper
%B CBMI
%D 2011
%T Detecting the long-tail of Points of Interest in tagged photo collections
%A Christos Zigkolis
%A Symeon Papadopoulos
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Martinez, José M.
%X <p>The paper tackles the problem of matching the photos of a tagged photo collection to a list of “long-tail” Points of Interest (PoIs), that is PoIs that are not very popular and thus not well represented in the photo collection. Despite the significance of improving “long-tail” PoI photo retrieval for travel applications, most landmark detection methods to date have been tested on very popular landmarks. In this paper, we conduct a thorough empirical analysis comparing four baseline matching methods that rely on photo metadata, three variants of an approach that uses cluster analysis in order to discover PoI-related photo clusters, and a real-world retrieval mechanism (Flickr search) on a set of less popular PoIs. A user-based evaluation of the aforementioned methods is conducted on a Flickr photo collection of over 100,000 photos from 10 well-known touristic destinations in Greece. A set of 104 “long-tail” PoIs is collected for these destinations from Wikipedia, Wikimapia and OpenStreetMap. The results demonstrate that two of the baseline methods outperform Flickr search in terms of precision and F-measure, whereas two of the cluster-based methods outperform it in terms of recall and PoI coverage. We consider the results of this study valuable for enhancing the indexing of pictorial content in social media sites.</p>
%B CBMI
%I IEEE
%P 235-240
%@ 978-1-61284-433-6
%G eng

%0 Conference Paper
%B ACII (1)
%D 2011
%T Emotional Aware Clustering on Micro-blogging Sources
%A Tsagkalidou, Katerina
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Konstantinos Kafetsios
%E D’Mello, Sidney K.
%E Graesser, Arthur C.
%E Schuller, Björn
%E Martin, Jean-Claude
%K Microblogging services
%K Sentiment analysis
%K web clustering
%X <p>Microblogging services have nowadays become a very popular communication tool among Internet users. Since millions of users share opinions on different aspects of life every day, microblogging websites are considered as a credible source for exploring both factual and subjective information. This fact has inspired research in the area of automatic sentiment analysis. In this paper we propose an emotional aware clustering approach which performs sentiment analysis of users’ tweets on the basis of an emotional dictionary and groups tweets according to the degree they express a specific set of emotions. Experimental evaluations on datasets derived from Twitter prove the efficiency of the proposed approach.</p>
%B ACII (1)
%S Lecture Notes in Computer Science
%I Springer
%V 6974
%P 387-396
%@ 978-3-642-24599-2
%G eng

%0 Book Section
%B Next Generation Data Technologies for Collective Computational Intelligence
%D 2011
%T Leveraging Massive User Contributions for Knowledge Extraction
%A Nikolopoulos, Spiros
%A Chatzilari, Elisavet
%A Giannakidou, Eirini
%A Symeon Papadopoulos
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Bessis, Nik
%E Xhafa, Fatos
%B Next Generation Data Technologies for Collective Computational Intelligence
%S Studies in Computational Intelligence
%I Springer
%V 352
%P 415-443
%@ 978-3-642-20343-5
%G eng

%0 Conference Paper
%B VS-GAMES
%D 2011
%T Towards a User-Aware Virtual Museum
%A Christos Zigkolis
%A Vassiliki A. Koutsonikola
%A Despoina Chatzakou
%A Karagiannidis, Savvas
%A Maria Giatsoglou
%A Kosmatopoulos, Andreas
%A Athena Vakali
%E Liarokapis, Fotis
%E Doulamis, Anastasios D.
%E Vescoukis, Vassilios
%K user groups
%K user preferences
%K virtual museum
%B VS-GAMES
%I IEEE Computer Society
%P 228-235
%@ 978-1-4577-0316-4
%G eng

%0 Conference Paper
%B ICT-GLOW
%D 2011
%T Utilization-Aware Redirection Policy in CDN: A Case for Energy Conservation
%A ul Islam, Saif
%A Stamos, Konstantinos
%A Pierson, Jean-Marc
%A Athena Vakali
%E Kranzlmüller, Dieter
%E Tjoa, A Min
%K CDNs
%K Energy conservation
%K QoE
%X <p>Due to the gradual and rapid increase in the Information and Communication Technology (ICT) industry, it is very important to introduce energy efficient techniques and infrastructures in large scale distributed systems. Content Distribution Networks (CDNs) are one of these popular systems which try to make the contents closer to the widely dispersed Internet users. A Content Distribution Network provides its services by using a number of surrogate servers geographically distributed in the web. Surrogate servers have the copies of the original contents belonging to the origin server, depending on their storage capacity. When a client requests some particular contents from a surrogate server, either this request can be fulfilled directly by it or, in case of absence of the requested contents, surrogate servers cooperate with each other or with the origin server. In this paper, our focus is on the surrogate servers’ utilization and using it as a parameter to conserve energy in CDNs while trying to maintain an acceptable Quality of Experience (QoE).</p>
%B ICT-GLOW
%S Lecture Notes in Computer Science
%I Springer
%V 6868
%P 180-187
%@ 978-3-642-23446-0
%G eng

%0 Conference Paper
%B HT
%D 2010
%T Automatic extraction of structure, content and usage data statistics of web sites
%A Paparrizos, Ioannis K.
%A Vassiliki A. Koutsonikola
%A Angelis, Lefteris
%A Athena Vakali
%E Chignell, Mark H.
%E Toms, Elaine G.
%K classification
%K Crawling
%K Structure Content and Usage data
%K Web Mining Algorithm
%X <p>In this paper we present a web mining tool which automatically extracts the structure, content and usage data statistics of web sites. This work is inspired by the fact that web mining consists of three axes: web structure mining, web content mining and web usage mining. Each one of those axes uses the structure, content and usage data respectively. The scope is to use the developed multi-thread web crawler as a tool to automatically extract from web pages data that are associated with each one of those three axes, in order afterwards to compute several useful descriptive statistics and apply advanced mathematical and statistical methods. A description of our system is provided as well as some experimentation results.</p>
%B HT
%I ACM
%P 301-302
%@ 978-1-4503-0041-4
%G eng

%0 Journal Article
%J ACM Trans. Model. Comput. Simul.
%D 2010
%T CDNsim: A simulation tool for content distribution networks
%A Stamos, Konstantinos
%A Pallis, George
%A Athena Vakali
%A Katsaros, Dimitrios
%A Sidiropoulos, Antonis
%A Manolopoulos, Yannis
%K caching
%K Content Distribution Network
%K services
%K trace-driven simulation
%X <p>Content Distribution Networks (CDNs) have gained considerable attention in the past few years. As such, there is a need for developing frameworks for carrying out CDN simulations. In this paper, we present a modeling and simulation framework for CDNs, called CDNsim. CDNsim has been designed to provide a realistic simulation for CDNs, simulating the surrogate servers, the TCP/IP protocol and the main CDN functions. The main advantages of this tool are its high performance, its extensibility and its user interface which is used to configure its parameters. CDNsim provides an automated environment for conducting experiments and extracting client, server and network statistics. The purpose of CDNsim is to be used as a testbed for CDN evaluation and experimentation. This is quite useful both for the research community (to experiment with new CDN data management techniques) and for CDN developers (to evaluate profits on prior certain CDN installations).</p>
%B ACM Trans. Model. Comput. Simul.
%V 20
%G eng

%0 Conference Paper
%B ACM Multimedia
%D 2010
%T ClustTour: city exploration by use of hybrid photo clustering
%A Symeon Papadopoulos
%A Christos Zigkolis
%A Kapiris, Stefanos
%A Yiannis Kompatsiaris
%A Athena Vakali
%E Bimbo, Alberto Del
%E Chang, Shih-Fu
%E Smeulders, Arnold W. M.
%K Clustering
%K event and landmark detection
%K tagging
%X <p>We present a technical demonstration of an online city exploration application that helps users identify interesting spots in a city by use of photo clusters corresponding to landmarks and events. Our application, called ClustTour, is based on an efficient landmark and event detection scheme for tagged photo collections. The proposed scheme relies on the combination of a graph-based photo clustering algorithm, making use of both visual and tag information of photos, with a cluster classification and merging module. ClustTour creates a map-based visualization of the identified photo clusters that are classified in prominent categories and are filterable by time and tag. We believe that such an application can greatly facilitate the task of knowing a city through its landmarks and events. So far, the demo has been based on a large photo dataset focused on Barcelona, and it is gradually expanding to contain photo clusters of several major cities of Europe. Furthermore, an Android application is developed that complements the web-based version of ClustTour.</p>
%B ACM Multimedia
%I ACM
%P 1617-1620
%@ 978-1-60558-933-6
%G eng

%0 Conference Paper
%B Panhellenic Conference on Informatics
%D 2010
%T Dynamic Code Generation for Cultural Content Management
%A Maria Giatsoglou
%A Vassiliki A. Koutsonikola
%A Stamos, Konstantinos
%A Athena Vakali
%A Christos Zigkolis
%B Panhellenic Conference on Informatics
%I IEEE Computer Society
%P 21-24
%@ 978-1-4244-7838-5
%G eng

%0 Journal Article
%J IJDWM
%D 2010
%T The Dynamics of Content Popularity in Social Media
%A Symeon Papadopoulos
%A Athena Vakali
%A Yiannis Kompatsiaris
%K Collaborative Technologies
%K Data Mining
%K Electronic Media
%K Online Behavior
%K Online Community
%K Resource Sharing
%K Web-Based Applications
%X <p>Social Bookmarking Systems (SBS) have been widely adopted in the last years, and thus they have had a significant impact on the way that online content is accessed, read and rated. Until recently, the decision on what content to display in a publisher’s web pages was made by one or at most few authorities. In contrast, modern SBS-based applications permit their users to submit their preferred content, to comment on and to rate the content of other users and establish social relations with each other. In that way, the vision of the social media is realized, i.e. the online users collectively decide upon the interestingness of the available bookmarked content. This article attempts to provide insights into the dynamics emerging from the process of content rating by the user community. To this end, the article proposes a framework for the study of the statistical properties of an SBS, the evolution of bookmarked content popularity and user activity in time, as well as the impact of online social networks on the content consumption behavior of individuals. The proposed analysis framework is applied to a large dataset collected from Digg, a popular social media application.</p>
%B IJDWM
%V 6
%P 20-37
%G eng

%0 Conference Paper
%B WIAMIS
%D 2010
%T Exploring temporal aspects in user-tag co-clustering
%A Giannakidou, Eirini
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Yiannis Kompatsiaris
%X <p>Tagging environments have become an interesting topic of research lately, focused mainly on clustering approaches, in order to extract emergent patterns that are derived from tag similarity and involve tag relations or user interconnections. Apart from tag similarity, an interesting parameter to be analyzed during the clustering/mining process in such data is the actual time that each tagging activity occurred. Indeed, holding a temporal dimension unfolds macroscopic and microscopic views of tagging, highlights links between objects for specific time periods and, in general, lets us observe how the users’ tagging activity changes over time. In this article, we propose a time-aware user/tag clustering approach, which groups together similar users and tags that are very “active” during the same time periods. Emphasis is given on using varying time scales, so that we distinguish between clusters that are robust at many time scales and clusters that are somehow occasional, i.e. they emerge only at a specific time period.</p>
%B WIAMIS
%I IEEE
%P 1-4
%@ 978-88-905328-0-1
%G eng

%0 Conference Paper
%B Proceedings of the 12th international conference on Data warehousing and knowledge discovery
%D 2010
%T A graph-based clustering scheme for identifying related tags in folksonomies
%A Symeon Papadopoulos
%A Yiannis Kompatsiaris
%A Athena Vakali
%K community detection
%K folksonomies
%K graph-based clustering
%K tag recommendation
%X <p>The paper presents a novel scheme for graph-based clustering with the goal of identifying groups of related tags in folksonomies. The proposed scheme searches for core sets, i.e. groups of nodes that are densely connected to each other, by efficiently exploring the two-dimensional core parameter space, and successively expands the identified cores by maximizing a local subgraph quality measure. We evaluate this scheme on three real-world tag networks by assessing the relatedness of same-cluster tags and by using tag clusters for tag recommendation. In addition, we compare our results to the ones derived from a baseline graph-based clustering method and from a popular modularity maximization clustering method.</p>
%B Proceedings of the 12th international conference on Data warehousing and knowledge discovery
%S DaWaK’10
%I Springer-Verlag
%C Berlin, Heidelberg
%P 65–76
%@ 3-642-15104-3, 978-3-642-15104-0
%G eng

%0 Conference Paper
%B ICIP
%D 2010
%T Image clustering through community detection on hybrid image similarity graphs
%A Symeon Papadopoulos
%A Christos Zigkolis
%A Tolias, Giorgos
%A Kalantidis, Yannis
%A Mylonas, Phivos
%A Yiannis Kompatsiaris
%A Athena Vakali
%K community detection
%K content-based image retrieval
%K image clustering
%K tags
%K visual similarity
%X <p>The wide adoption of photo sharing applications such as Flickr and the massive amounts of user-generated content uploaded to them raise an information overload issue for users. An established technique to overcome such an overload is to cluster images into groups based on their similarity and then use the derived clusters to assist navigation and browsing of the collection. In this paper, we present a community detection (i.e. graph-based clustering) approach that makes use of both visual and tagging features of images in order to efficiently extract groups of related images within large image collections. Based on experiments we conducted on a dataset comprising publicly available images from Flickr, we demonstrate the efficiency of our method, the added value of combining visual and tag features and the utility of the derived clusters for exploring an image collection.</p>
%B ICIP
%I IEEE
%P 2353-2356
%@ 978-1-4244-7994-8
%G eng

%0 Journal Article
%J IEEE Trans. Knowl. Data Eng.
%D 2009
%T CDNs Content Outsourcing via Generalized Communities
%A Katsaros, Dimitrios
%A Pallis, George
%A Stamos, Konstantinos
%A Athena Vakali
%A Sidiropoulos, Antonis
%A Manolopoulos, Yannis
%K caching
%K content distribution networks
%K replication
%K social network analysis
%K web communities
%X <p>Content distribution networks (CDNs) balance costs and quality in services related to content delivery. Devising an efficient content outsourcing policy is crucial since, based on such policies, CDN providers can provide client-tailored content, improve performance, and result in significant economical gains. Earlier content outsourcing approaches may often prove ineffective since they drive prefetching decisions by assuming knowledge of content popularity statistics, which are not always available and are extremely volatile. This work addresses this issue by proposing a novel self-adaptive technique under a CDN framework on which outsourced content is identified with no a priori knowledge of (earlier) request statistics. This is employed by using a structure-based approach identifying coherent clusters of “correlated” Web server content objects, the so-called Web page communities. These communities are the core outsourcing unit, and in this paper, a detailed simulation experimentation has shown that the proposed technique is robust and effective in reducing user-perceived latency as compared with competing approaches, i.e., two communities-based approaches, Web caching, and non-CDN.</p>
%B IEEE Trans. Knowl. Data Eng.
%V 21
%P 137-151
%G eng

%0 Journal Article
%J IEEE Internet Computing
%D 2009
%T Cloud Computing: Distributed Internet Computing for IT and Scientific Research
%A Dikaiakos, Marios D.
%A Katsaros, Dimitrios
%A Mehra, Pankaj
%A Pallis, George
%A Athena Vakali
%X <p>Cloud computing is a recent trend in information technology and networking that has the potential to change radically the way computer services are constructed, managed, and delivered. The key driving forces behind the emergence of cloud computing include the overcapacity of today’s large corporate data centers, the ubiquity of broadband and wireless networking, the falling cost of storage, and progressive improvements in networking technologies. Cloud computing opens new perspectives with profound implications in the area of communication networks, raising new issues in their architecture, design, and implementation.</p>
%B IEEE Internet Computing
%V 13
%P 10-13
%G eng

%0 Conference Paper
%B WISE
%D 2009
%T Clustering of Social Tagging System Users: A Topic and Time Based Approach
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Giannakidou, Eirini
%A Yiannis Kompatsiaris
%E Vossen, Gottfried
%E Long, Darrell D. E.
%E Yu, Jeffrey Xu
%K Social tagging systems
%K time
%K topic
%K user clustering
%X <p>Under Social Tagging Systems, a typical Web 2.0 application, users label digital data sources by using freely chosen textual descriptions (tags). Mining tag information reveals the topic-domain of users’ interests and significantly contributes to a profile construction process. In this paper we propose a clustering framework which groups users according to their preferred topics and the time locality of their tagging activity. Experimental results demonstrate the efficiency of the proposed approach, which results in more enriched time-aware users’ profiles.</p>
%B WISE
%S Lecture Notes in Computer Science
%I Springer
%V 5802
%P 75-86
%@ 978-3-642-04408-3
%G eng

%0 Journal Article
%J I. J. Knowledge and Web Intelligence
%D 2009
%T A fuzzy bi-clustering approach to correlate web users and pages
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%K fuzzy bi-clustering
%K spectral analysis
%K web pages
%K web users
%X <p>With the rapid development of information technology, the significance of clustering in the process of delivering information to users is becoming more eminent. Especially in the web information space, clustering analysis can prove particularly beneficial for a variety of applications such as web personalisation and profiling, caching and prefetching and content delivery networks. In this paper, we propose a bi-clustering approach, which identifies groups of related web users and pages. The proposed approach is a three-step process that relies on the principles of spectral clustering analysis and provides a fuzzy relation scheme for the revealed users’ and pages’ clusters. Experiments have been conducted on both synthetic and real datasets to prove the proposed method’s efficiency and reveal hidden knowledge.</p>
%B I. J. Knowledge and Web Intelligence
%V 1
%P 3-23
%G eng

%0 Journal Article
%J Neurocomputing
%D 2009
%T Fuzzy lattice reasoning (FLR) type neural computation for weighted graph partitioning
%A Kaburlasos, Vassilis G.
%A Moussiades, Lefteris
%A Athena Vakali
%K Clustering
%K Fuzzy lattices
%K Graph partitioning
%K Metric Measurable path
%K Similarity measure
%X <p>The fuzzy lattice reasoning (FLR) neural network was introduced lately based on an inclusion measure function. This work presents a novel FLR extension, namely agglomerative similarity measure FLR, or asmFLR for short, for clustering based on a similarity measure function; the latter (function) may also be based on a metric. We demonstrate application in a metric space emerging from a weighted graph towards partitioning it. The asmFLR compares favorably with four alternative graph-clustering algorithms from the literature in a series of computational experiments on artificial data. In addition, our work introduces a novel index for the quality of clustering, which (index) compares favorably with two popular indices from the literature.</p>
%B Neurocomputing
%V 72
%P 2121-2133
%G eng

%0 Unpublished Work
%D 2009
%T Leveraging Collective Intelligence through Community Detection in Tag Networks
%A Symeon Papadopoulos
%A Yiannis Kompatsiaris
%A Athena Vakali
%K collective intelligence
%K community detection
%K tag networks
%X <p>The paper studies the problem of community detection in tag networks, i.e. networks consisting of associations between tags that are used within Social Tagging Systems (STS) to annotate online resources (e.g. bookmarks, pictures, videos, etc.). Community detection methods aim at uncovering densely connected groups of tags, which can reveal the topic structure emerging in the STS. In this way, community detection in tag networks leverages Collective Intelligence (CI), that is the intelligence that is accumulated as a result of the collective activities of masses of users.</p>
%G eng

%0 Conference Paper
%B BCI
%D 2009
%T Mining the Community Structure of a Web Site
%A Moussiades, Lefteris
%A Athena Vakali
%E Kefalas, Petros
%E Stamatis, Demosthenes
%E Douligeris, Christos
%B BCI
%I IEEE Computer Society
%P 239-244
%@ 978-0-7695-3783-2
%G eng

%0 Journal Article
%J IJWIS
%D 2009
%T A new approach to web users clustering and validation: a divergence-based scheme
%A Vassiliki A. Koutsonikola
%A Petridou, Sophia G.
%A Athena Vakali
%A Papadimitriou, Georgios I.
%K Cluster analysis
%K Internet Data mining
%K User studies
%X <p>Purpose – Web users’ clustering is an important mining task since it contributes to identifying usage patterns, a beneficial task for a wide range of applications that rely on the web. The purpose of this paper is to examine the usage of Kullback-Leibler (KL) divergence, an information theoretic distance, as an alternative option for measuring distances in web users clustering. Design/methodology/approach – KL-divergence is compared with other well-known distance measures and clustering results are evaluated using a criterion function, validity indices, and graphical representations. Furthermore, the impact of noise (i.e. occasional or mistaken page visits) is evaluated, since it is imperative to assess whether a clustering process exhibits tolerance in noisy environments such as the web. Findings – The proposed KL clustering approach is of similar performance when compared with other distance measures under both synthetic and real data workloads. Moreover, imposing extra noise on real data, the approach shows minimum deterioration among most of the other conventional distance measures. Practical implications – The experimental results show that a probabilistic measure such as KL-divergence has proven to be quite efficient in noisy environments and thus constitutes a good alternative for the web users clustering problem. Originality/value – This work is inspired by the usage of divergence in clustering of biological data and it is introduced by the authors in the area of web clustering. According to the experimental results presented in this paper, KL-divergence can be considered as a good alternative for measuring distances in noisy environments such as the web.</p>
%B IJWIS
%V 5
%P 348-371
%G eng

%0 Conference Paper
%B WAIM
%D 2008
%T Co-Clustering Tags and Social Data Sources
%A Giannakidou, Eirini
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Yiannis Kompatsiaris
%X <p>Under social tagging systems, a typical Web 2.0 application, users label digital data sources by using freely chosen textual descriptions (tags). Poor retrieval in the aforementioned systems remains a major problem mostly due to questionable tag validity and tag ambiguity. Earlier clustering techniques have shown limited improvements, since they were based mostly on tag co-occurrences. In this paper, a co-clustering approach is employed, that exploits joint groups of related tags and social data sources, in which both social and semantic aspects of tags are considered simultaneously. Experimental results demonstrate the efficiency and the beneficial outcome of the proposed approach in correlating relevant tags and resources.</p>
%B WAIM
%I IEEE
%P 317-324
%@ 978-0-7695-3185-4
%G eng

%0 Conference Paper
%B WISE
%D 2008
%T Correlating Time-Related Data Sources with Co-clustering
%A Vassiliki A. Koutsonikola
%A Petridou, Sophia G.
%A Athena Vakali
%A Hacid, Hakim
%A Benatallah, Boualem
%E Bailey, James
%E Maier, David
%E Schewe, Klaus-Dieter
%E Thalheim, Bernhard
%E Wang, Xiaoyang Sean
%X <p>A huge amount of data is circulated and collected every day on a regular time basis. Given a pair of such datasets, it might be possible to reveal hidden dependencies between them since the presence of the one dataset elements may influence the elements of the other dataset and vice versa. Furthermore, the impact of these relations may last during a period instead of the time point of their co-occurrence. Mining such relations under those assumptions is a challenging problem. In this paper, we study two time-related datasets whose elements are bilaterally affected over time. We employ a co-clustering approach to identify groups of similar elements on the basis of two distinct criteria: the direction and duration of their impact. The proposed approach is evaluated using time-related news and stock’s market real datasets.</p>
%B WISE
%S Lecture Notes in Computer Science
%I Springer
%V 5175
%P 264-279
%@ 978-3-540-85480-7
%G eng

%0 Journal Article
%J World Wide Web
%D 2008
%T Prefetching in Content Distribution Networks via Web Communities Identification and Outsourcing
%A Sidiropoulos, Antonis
%A Pallis, George
%A Katsaros, Dimitrios
%A Stamos, Konstantinos
%A Athena Vakali
%A Manolopoulos, Yannis
%B World Wide Web
%V 11
%P 39-70
%G eng

%0 Conference Paper
%B ICSC
%D 2008
%T SEMSOC: SEMantic, SOcial and Content-Based Clustering in Multimedia Collaborative Tagging Systems
%A Giannakidou, Eirini
%A Yiannis Kompatsiaris
%A Athena Vakali
%B ICSC
%I IEEE Computer Society
%P 128-135
%@ 978-0-7695-3279-0
%G eng

%0 Conference Paper
%B ISMIS
%D 2008
%T A Structure-Based Clustering on LDAP Directory Information
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Mpalasas, Antonios
%A Valavanis, Michael
%E An, Aijun
%E Matwin, Stan
%E Ras, Zbigniew W.
%E Slezak, Dominik
%X <p>LDAP directories have rapidly emerged as the essential framework for storing a wide range of heterogeneous information under various applications and services. Increasing amounts of information are being stored in LDAP directories imposing the need for efficient data organization and retrieval. In this paper, we propose the LPAIR &amp; LMERGE (LP-LM) hierarchical agglomerative clustering algorithm for improving LDAP data organization. LP-LM merges a pair of clusters at each step, considering the LD-vectors, which represent the entries’ structure. The clustering-based LDAP data organization enhances LDAP server’s response times, under a specific query framework.</p>
%B ISMIS
%S Lecture Notes in Computer Science
%I Springer
%V 4994
%P 121-130
%@ 978-3-540-68122-9
%G eng

%0 Journal Article
%J IEEE Trans. Knowl. Data Eng.
%D 2008
%T Time-Aware Web Users’ Clustering
%A Petridou, Sophia G.
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Papadimitriou, Georgios I.
%B IEEE Trans. Knowl. Data Eng.
%V 20
%P 653-667
%G eng

%0 Conference Paper
%B ICCSA (2)
%D 2006
%T A Divergence-Oriented Approach for Web Users Clustering
%A Petridou, Sophia G.
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%A Papadimitriou, Georgios I.
%E Gavrilova, Marina L.
%E Gervasi, Osvaldo
%E Kumar, Vipin
%E Tan, Chih Jeng Kenneth
%E Taniar, David
%E Laganà, Antonio
%E Mun, Youngsong
%E Choo, Hyunseung
%X Clustering web users based on their access patterns is a quite significant task in Web Usage Mining. Further to clustering it is important to evaluate the resulted clusters in order to choose the best clustering for a particular framework. This paper examines the usage of Kullback-Leibler divergence, an information theoretic distance, in conjunction with the k-means clustering algorithm. It compares KL-divergence with other well known distance measures (Euclidean, Standardized Euclidean and Manhattan) and evaluates clustering results using both objective function’s value and Davies-Bouldin index. Since it is imperative to assess whether the results of a clustering process are susceptible to noise, especially in noisy environments such as Web environment, our approach takes the impact of noise into account. The clusters obtained with KL approach seem to be superior to those obtained with the other distance measures in case our data have been corrupted by noise.
%B ICCSA (2)
%S Lecture Notes in Computer Science
%I Springer
%V 3981
%P 1229-1238
%@ 3-540-34072-6
%G eng

%0 Conference Paper
%B ICDE Workshops
%D 2006
%T Replication Based on Objects Load under a Content Distribution Network
%A Pallis, George
%A Stamos, Konstantinos
%A Athena Vakali
%A Katsaros, Dimitrios
%A Sidiropoulos, Antonis
%A Manolopoulos, Yannis
%E Barga, Roger S.
%E Zhou, Xiaofang
%B ICDE Workshops
%I IEEE Computer Society
%P 53
%G eng

%0 Conference Paper
%B ACSAC
%D 2005
%T Intrusion Detection in RBAC-administered Databases
%A Bertino, Elisa
%A Kamra, Ashish
%A Terzi, Evimaria
%A Athena Vakali
%X <p>A considerable effort has been recently devoted to the development of Database Management Systems (DBMS) which guarantee high assurance security and privacy. An important component of any strong security solution is represented by intrusion detection (ID) systems, able to detect anomalous behavior by applications and users. To date, however, there have been very few ID mechanisms specifically tailored to database systems. In this paper, we propose such a mechanism. The approach we propose to ID is based on mining database traces stored in log files. The result of the mining process is used to form user profiles that can model normal behavior and identify intruders. An additional feature of our approach is that we couple our mechanism with Role Based Access Control (RBAC). Under a RBAC system permissions are associated with roles, usually grouping several users, rather than with single users. Our ID system is able to determine role intruders, that is, individuals that while holding a specific role, have a behavior different from the normal behavior of the role. An important advantage of providing an ID mechanism specifically tailored to databases is that it can also be used to protect against insider threats. Furthermore, the use of roles makes our approach usable even for databases with large user population. Our preliminary experimental evaluation on both real and synthetic database traces show that our methods work well in practical situations.</p>
%B ACSAC
%I IEEE Computer Society
%P 170-182
%@ 0-7695-2461-3
%G eng

%0 Conference Paper
%B LA-WEB
%D 2005
%T A Latency-Based Object Placement Approach in Content Distribution Networks
%A Pallis, George
%A Athena Vakali
%A Stamos, Konstantinos
%A Sidiropoulos, Antonis
%A Katsaros, Dimitrios
%A Manolopoulos, Yannis
%B LA-WEB
%I IEEE Computer Society
%P 140-147
%@ 0-7695-2471-0
%G eng

%0 Book Section
%B Encyclopedia of Information Science and Technology (V)
%D 2005
%T Storage and Access Control Issues for XML Documents
%A Pallis, George
%A Stoupa, Konstantina
%A Athena Vakali
%E Khosrow-Pour, Mehdi
%B Encyclopedia of Information Science and Technology (V)
%I Idea Group
%P 2616-2621
%@ 1-59140-553-X
%G eng

%0 Journal Article
%J IEEE Internet Computing
%D 2004
%T LDAP: Framework, Practices, and Trends
%A Vassiliki A. Koutsonikola
%A Athena Vakali
%B IEEE Internet Computing
%V 8
%P 66-72
%G eng

